Feature normalization using smoothed mixture transformations
Abstract
We propose a method for estimating the parameters of SPLICE-like transformations from individual utterances, so that this type of transformation can be used to normalize acoustic feature vectors for speech recognition on an utterance-by-utterance basis, in a manner similar to cepstral mean normalization. We report results on an in-house French-language multi-speaker database collected while deploying an automatic closed-captioning system for live broadcast news. An unusual feature of this database is that there are very large amounts of training data for the individual speakers (typically several hours), so that it is very difficult to improve on multi-speaker modeling by using standard methods of speaker adaptation. We found that the proposed method of feature normalization is capable of achieving a 6% relative improvement over cepstral mean normalization on this task.
Similar resources
Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition
State-of-the-art session variability compensation for speaker recognition is generally based on various linear statistical models of the Gaussian Mixture Model (GMM) mean super-vectors, while front-end features are only processed by standard normalization techniques. In this study, we propose a front-end channel compensation framework using mixture-localized linear transforms that operate befo...
Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis
State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In th...
Advanced Feature Normalization and Rapid Model Adaptation for Robust In-Vehicle Speech Recognition
In this study, we present advanced feature normalization and rapid model adaptation for robust in-vehicle speech recognition. For feature normalization, we use a combination of recently established quantile-based cepstral dynamics normalization (QCN) and low pass temporal filtering (RASTALP). Similar to cepstral mean normalization (CMN), QCN aims at alleviating the mismatch between ASR acoustic...
Likelihood normalization for face authentication in variable recording conditions
In this paper we evaluate the effectiveness of two likelihood normalization techniques, the Background Model Set (BMS) and the Universal Background Model (UBM), for improving performance and robustness of four face authentication systems utilizing a Gaussian Mixture Model (GMM) classifier. The systems differ in the feature extraction method used: eigenfaces (PCA), 2-D DCT, 2-D Gabor wavelets an...
Publication date: 2006